{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# torch.cuda\n",
    "该包增加了对CUDA张量类型的支持，实现了与CPU张量相同的功能，但使用GPU进行计算。\n",
    "它是懒惰的初始化，所以你可以随时导入它，并使用is_available()来确定系统是否支持CUDA。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "620199616"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "#返回cublasHandle_t指针，指向当前cuBLAS句柄\n",
    "torch.cuda.current_blas_handle() "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#返回当前所选设备的索引\n",
    "torch.cuda.current_device()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<torch.cuda.Stream device=cuda:0 cuda_stream=0x0>"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#返回一个当前所选的 Stream\n",
    "torch.cuda.current_stream() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### torch.cuda.device(idx)   \n",
    "上下文管理器，可以更改所选设备。  \n",
    "参数：idx(int)表示设备索引选择。如果这个参数是负的，则是无效操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<torch.cuda.device at 0x18496ba8>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "torch.cuda.device(0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#返回可得到的GPU数量。\n",
    "torch.cuda.device_count()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#返回一个bool值，指示CUDA当前是否可用\n",
    "torch.cuda.is_available()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "设置当前设备。  \n",
    "不鼓励使用此函数来设置。在大多数情况下，最好使用CUDA_VISIBLE_DEVICES环境变量。  \n",
    "参数：device(int)表示所选设备。如果此参数为负，则此函数是无效操作。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "device = torch.cuda.set_device(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "选择给定流的上下文管理器。  \n",
    "在其上下文中排队的所有CUDA核心将在所选流上入队。  \n",
    "参数：stream(Stream)表示所选流。如果是None，则这个管理器是无效的。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<contextlib._GeneratorContextManager at 0x18496f28>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stream = torch.cuda.current_stream() \n",
    "torch.cuda.stream(stream)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "等待当前设备上所有流中的所有核心完成。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "torch.cuda.synchronize() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### torch.cuda.comm.reduce_add(inputs, destination=None) \n",
    "将来自多个GPU的张量相加，所有输入应具有匹配的形状  \n",
    "参数：\n",
    "- inputs(Iterable[Tensor])：要相加张量的迭代 \n",
    "- destination(int, optional)：将放置输出的设备（默认值：当前设备）  \n",
    "\n",
    "返回： 一个包含放置在destination设备上的所有输入的元素总和的张量。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### torch.cuda.comm.scatter(tensor, devices, chunk_sizes=None, dim=0, streams=None) \n",
    "打散横跨多个GPU的张量。\n",
    "参数：  \n",
    "- tensor(Tensor)：要分散的张量\n",
    "- devices(Iterable[int]) int的迭代，指定哪些设备应该分散张量。 \n",
    "- chunk_sizes(Iterable[int], optional)：要放置在每个设备上的块大小。它应该匹配devices的长度并且总和为tensor.size(dim)。如果没有指定，张量将被分成相等的块。\n",
    "- dim(int, optional)：沿着这个维度来chunk张量\n",
    "\n",
    "返回：包含tensor块的元组，分布在给定的devices上。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### torch.cuda.comm.gather(tensors, dim=0, destination=None) \n",
    "从多个GPU收集张量。\n",
    "张量尺寸在不同于 dim 的所有维度上都应该匹配。\n",
    "参数：\n",
    "- tensors(Iterable[Tensor])：要收集的张量的迭代。 \n",
    "- dim(int)：沿着此维度张量将被连接。 \n",
    "- destination(int, optional)：输出设备（-1表示CPU，默认值：当前设备）。\n",
    "\n",
    "返回： 一个张量位于destination设备上，这是沿着dim连接tensors的结果。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 流和事件\n",
    "### class torch.cuda.Stream：CUDA流的包装  \n",
    "参数：\n",
    "- device(int, optional)：分配流的设备。 \n",
    "- priority(int, optional)：流的优先级。较低的数字代表较高的优先级。\n",
    "\n",
    "### query()  \n",
    "检查所有提交的工作是否已经完成。  \n",
    "返回： 一个布尔值，表示此流中的所有核心是否完成。    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### record_event(event=None)\n",
    "记录一个事件\n",
    "参数：event(Event, optional)：要记录的事件。如果没有给出，将分配一个新的。  \n",
    "返回：记录的事件\n",
    "\n",
    "### synchronize()\n",
    "等待此流中的所有核心完成。\n",
    "\n",
    "### wait_event(event)\n",
    "将所有未来的工作提交到流等待事件。\n",
    "参数：event(Event)：等待的事件\n",
    "\n",
    "### wait_stream(stream)\n",
    "与另一个流同步，提交到此流的所有未来工作将等待直到所有核心在调用完成时提交给给定的流"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### class torch.cuda.Event(enable_timing=False, blocking=False, interprocess=False,  _handle=None) \n",
    "CUDA事件的包装。\n",
    "参数： \n",
    "- enable_timing(bool)：指示事件是否应该测量时间（默认值：False） \n",
    "- blocking(bool)：如果为true，wait()将被阻塞（默认值：False） \n",
    "- interprocess(bool)：如果为true， 则可以在进程之间共享事件（默认值：False）\n",
    "\n",
    "### elapsed_time(end_event)\n",
    "返回事件记录之前经过的时间。\n",
    "\n",
    "### ipc_handle()\n",
    "返回此事件的IPC句柄。\n",
    "\n",
    "### query()\n",
    "检查事件是否已被记录。  \n",
    "返回： 一个布尔值，指示事件是否已被记录。\n",
    "\n",
    "### record(stream=None)\n",
    "记录给定流的事件。\n",
    "\n",
    "### synchronize()\n",
    "与事件同步。\n",
    "\n",
    "### wait(stream=None)\n",
    "使给定的流等待事件。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.2"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
